Template-Based English-Filipino Machine Translation System

Authors

  • Ethel Ong
  • Kathleen Go
  • Vince Andrew Nuñez
  • Manimin Morga
  • Francis Veto
Abstract

This paper presents a template-based machine translation system that extracts templates from a given bilingual corpus and then uses them to perform bidirectional English-Filipino translation. The system extends the similarity template learning algorithm of Cicekli and Guvenir [2] by refining existing templates and deriving templates from previously learned chunks. Chunk alignment and splitting algorithms are integrated into the training process to improve the quality of the extracted templates. Test results showed that a strict chunk alignment scheme during training, including the filtering of commonly occurring words, generated better templates and chunks. Correct extraction of templates and chunks during learning reduced word and sentence error rates by as much as 50% during translation. Tests also showed that the highest-scoring translation among a set of candidate translations is consistently the best choice, as validated against automatic evaluation methods.
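To make the idea of learning a template from similarities concrete, the following is a minimal Python sketch, not the system described in the abstract. It assumes whitespace tokenization, handles only the case where two translation examples differ in exactly one span on each side, and pairs those spans as chunks. The function names `generalize` and `learn_template`, the variable marker `X1`, and the example sentences are all hypothetical and chosen for illustration.

```python
from difflib import SequenceMatcher


def generalize(a, b):
    """Replace each differing span between token lists a and b with a variable.

    Returns (template_tokens, diffs_from_a, diffs_from_b).
    """
    template, diffs_a, diffs_b = [], [], []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Shared tokens stay in the template as-is.
            template.extend(a[i1:i2])
        else:
            # A difference becomes a numbered variable slot.
            template.append(f"X{len(diffs_a) + 1}")
            diffs_a.append(a[i1:i2])
            diffs_b.append(b[j1:j2])
    return template, diffs_a, diffs_b


def learn_template(pair1, pair2):
    """Learn one template and two chunks from two (source, target) examples.

    Simplifying assumption: only the one-difference case is handled;
    anything else returns None instead of guessing an alignment.
    """
    (src1, tgt1), (src2, tgt2) = pair1, pair2
    src_t, src_d1, src_d2 = generalize(src1.split(), src2.split())
    tgt_t, tgt_d1, tgt_d2 = generalize(tgt1.split(), tgt2.split())
    if len(src_d1) != 1 or len(tgt_d1) != 1:
        return None
    diffs = (src_d1[0], src_d2[0], tgt_d1[0], tgt_d2[0])
    if any(len(d) == 0 for d in diffs):
        # An empty span means a pure insertion or deletion; skip it here.
        return None
    template = (" ".join(src_t), " ".join(tgt_t))
    chunks = [
        (" ".join(src_d1[0]), " ".join(tgt_d1[0])),
        (" ".join(src_d2[0]), " ".join(tgt_d2[0])),
    ]
    return template, chunks


result = learn_template(
    ("I bought a book", "Bumili ako ng libro"),
    ("I bought a pencil", "Bumili ako ng lapis"),
)
print(result)
```

With more than one differing span, the source and target differences would have to be aligned with each other, which is the role the abstract assigns to the chunk alignment and splitting steps; this sketch deliberately leaves that out.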

Similar resources

TExt Translation: Template Extraction for a Bidirectional English-Filipino Example-Based Machine Translation

In this paper, we present TExt Translation, a bidirectional English-Filipino Example-based Machine Translation System that learns and uses templates. These templates are used for translating English input text into Filipino and vice versa. Minimal language resources and information are used since these resources are few and may contain errors. The system uses an untagged bilingual corpus, lexic...

Template Extraction for a Bidirectional English-Filipino Machine Translation System

A bidirectional English-Filipino Example-based Machine Translation System that learns and uses templates is presented. The system uses machine learning techniques to initially extract templates from a given bilingual corpus. These templates are subsequently used for translating English input text into Filipino and vice versa. The system implements the similarity template learning algorithm perf...

Learning Translation Rules for a Bidirectional English-Filipino Machine Translator

Filipino is a changing language that poses several challenges. Our goal is to develop a bidirectional English-Filipino Machine Translation (MT) system using a hybrid approach to learn rules from examples. The first phase was an English-to-Filipino MT system that required several language resources. The problem lies in its dependence on the annotated grammar, which is currently unavailable for ...

Rule Extraction Applied in Language Translation

Machine translation (MT) has been used to address problems inherent in human translation. However, the quality of machine translations is usually unacceptable. Research has focused on improving quality by incorporating machine learning into translation. One example is TWiRL, which translates English sentences into Filipino. However, TWiRL’s approach presented a strict requirement of a...

Learning Translation Rules from Bilingual English - Filipino Corpus

Most machine translators are implemented using example-based, rule-based, or statistical approaches. However, each of these paradigms has its drawbacks. Example-based and statistical approaches are domain specific and require a large database of examples to produce accurate translation results. Although the rule-based approach is known to produce high-quality translations, a linguist is nec...

Publication date: 2007